Forecasting COVID-19 Deaths in Selected US Locations

A simple model for estimating the maximum percentage of deaths due to COVID-19 in several US locations with more than 250 total deaths

============================================================================

NOTE:

  1. I am not an epidemiologist.
  2. The predictions are not official, and they are not backed by any organization or government.
  3. I am providing them to give an idea of when the COVID-19 peaks may occur.
  4. I will update the predictions after 7:00 pm CT every day.
  5. They are based on the data released by the CSSE at Johns Hopkins University.

“Disclaimer: Content from this website is STRICTLY ONLY for educational and research purposes and may contain errors. The model and data are inaccurate to the complex, evolving, and heterogeneous realities of different countries. Predictions are uncertain by nature. Readers must take any predictions with caution. Over-optimism based on some predicted end dates is dangerous because it may loosen our disciplines and controls and cause the turnaround of the virus and infection, and must be avoided.”

Statement copied from: https://ddi.sutd.edu.sg/

============================================================================

The predictions are based on a simple logistic model and the following basic assumptions for US locations:

  1. 70% of the population is exposed to SARS-CoV-2.
  2. 20% infection efficacy, i.e. 20% of the exposed subjects turn into COVID-19 cases.
  3. 2.0% mortality among the cases.
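
As a worked example of how these three assumptions combine, the sketch below multiplies them into a single expected fatality fraction and applies it to a hypothetical population of one million. The population figure and the variable names are illustrative assumptions, not values taken from the data.

```r
# Assumed rates, taken from the assumptions stated above
exposureRate  <- 0.70  # fraction of the population exposed to SARS-CoV-2
infectionRate <- 0.20  # fraction of exposed subjects that become COVID-19 cases
mortalityRate <- 0.02  # fraction of cases that die

# Combined expected fraction of the population that dies
expectedFraction <- exposureRate * infectionRate * mortalityRate

# Hypothetical population, used only for illustration
population <- 1e6
expectedDeaths <- expectedFraction * population

print(expectedFraction)
print(expectedDeaths)
```

Under these assumptions the expected fatality fraction is 0.28% of the population, which is the ceiling the fitted curves are compared against.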

I estimate the death rates by fitting the current trends to the logistic function.
The last 14 days will be used for the peak estimations, which are computed as follows:

If the data has not reached the peak, the plots will show estimations based on:

  1. The last 17 reported observations
  2. 17 observations ending one week ago
  3. 17 observations ending two weeks ago
  4. 17 observations ending three weeks ago
  5. 17 observations ending four weeks ago
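
The windows listed above can be sketched as index ranges into the daily series. This is a minimal illustration, assuming that "ending k weeks ago" means the window's last observation is 7*k days before the most recent one; `trendWindows` is a hypothetical helper, not a function from the source code.

```r
# Return the index ranges of the trend windows for a series of n observations.
# daysWindow: observations per window
# weeksBack:  how many weeks before the latest observation each window ends
trendWindows <- function(n, daysWindow = 17, weeksBack = 0:4) {
  ends <- n - 7 * weeksBack          # last index of each window
  starts <- ends - daysWindow + 1    # first index of each window
  keep <- starts >= 1                # drop windows that begin before the data
  data.frame(weeksBack = weeksBack[keep],
             start = starts[keep],
             end = ends[keep])
}

# Example with a hypothetical series of 60 daily observations
windows <- trendWindows(60)
print(windows)
```

Each row gives the first and last observation of one 17-point window; a series too short for the older windows simply yields fewer rows.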

If the data has reached the peak, the plots will show estimations based on:

  1. All the points from one week before the peak onward
  2. 17 observations ending one week prior to the peak
  3. 17 observations ending two weeks prior to the peak
  4. 17 observations ending three weeks prior to the peak
  5. 17 observations ending four weeks prior to the peak

The flattening of the curve is plotted as a grey line. It represents the maximum expected number of deaths as estimated at each date. If the maximum expected number of deaths is going down, the future looks good!
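
To illustrate how a logistic fit yields both a maximum expected number of deaths and a peak location, here is a minimal sketch on synthetic data. It uses `nls` with the self-starting `SSlogis` model from base R's stats package, which is an assumption on my part and not necessarily the fitting routine used in the source code; the synthetic curve parameters (1000, 20, 4) are made up.

```r
set.seed(1)  # make the synthetic noise reproducible

# Synthetic cumulative deaths: a logistic curve plus small Gaussian noise
days <- 1:40
deaths <- 1000 / (1 + exp((20 - days) / 4)) + rnorm(length(days), sd = 3)
observed <- data.frame(days = days, deaths = deaths)

# Fit deaths = Asym / (1 + exp((xmid - days) / scal))
fit <- nls(deaths ~ SSlogis(days, Asym, xmid, scal), data = observed)
estimates <- coef(fit)

# Asym is the plateau of the cumulative curve (maximum expected deaths);
# xmid is its inflection point, where the daily deaths peak
maxDeaths <- estimates[["Asym"]]
peakDay <- estimates[["xmid"]]
print(estimates)
```

Repeating such a fit on windows that end at different dates gives one plateau estimate per date, which is the kind of quantity the grey line tracks.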

The peak locations are plotted as red diamonds. The estimated peaks are based on information from dates before the actual peak. You can use them to get an idea of how reliable the estimations were.

I am also providing optimistic models. In these, I assume that the number of fatalities is 1/3 of the total expected.

=====================================================================================

Notes:

  1. If the current observations do not match the estimated CDF, do not trust the predictions.
  2. If the predicted CDF does not reach 1.0, then there is the potential for more infections in the future, or COVID-19 is not as deadly as I estimated.

Source code:

https://github.com/joseTamezPena/COVID_Forecasting

# The expected fraction of deaths in each city (0.28% of the population)

expectedtotalFatalities <- 0.70*0.20*0.02
optGain <- 3
# The number of observations used for the trends
daysWindow <- 17

today <- Sys.Date()

currentdate <- paste(as.character(today),":") 

Loading the data

The data is the US deaths time series from CSSE at Johns Hopkins University:

https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_US.csv
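
The loading code is not shown here, so the following is only a sketch of one way to obtain a table shaped like `time_covid19US`, with one row per location and one column per date. It assumes the rows are aggregated by the `Province_State` column, and it reads a tiny inline sample with made-up numbers instead of the real file; with the real data, the file would be passed to `read.csv` in place of the `text` argument.

```r
# Tiny inline sample mimicking the layout of time_series_covid19_deaths_US.csv
# (only a few of the real columns are kept; the numbers are made up)
csvText <- "Admin2,Province_State,Population,4/1/20,4/2/20,4/3/20
CountyA,StateX,1000,1,2,4
CountyB,StateX,2000,0,1,1
CountyC,StateY,1500,3,3,5"

raw <- read.csv(text = csvText, check.names = FALSE, stringsAsFactors = FALSE)

# Date columns are the ones whose names look like m/d/yy
dateCols <- grep("^[0-9]+/[0-9]+/[0-9]+$", colnames(raw), value = TRUE)

# Sum the county rows of each state; rowsum() returns a matrix with
# one row per group and the group labels as row names
time_covid19US <- rowsum(as.matrix(raw[, dateCols]), group = raw$Province_State)
print(time_covid19US)
```

The result can be indexed by row name and by column position, which is how the plotting code uses `time_covid19US`.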

Plotting some trends

# Total deaths per location: the last column holds the latest cumulative count
Country.Region <- rownames(time_covid19US)
totaldeaths <- as.numeric(time_covid19US[,ncol(time_covid19US)])
names(totaldeaths) <- Country.Region
# Sort the locations from most to fewest deaths
totaldeaths <- totaldeaths[order(-totaldeaths)]


# Plot the location with the most deaths, dropping the days with no recorded deaths
ydata <- as.numeric(time_covid19US[names(totaldeaths[1]),])
ydata <- ydata[ydata > 1e-6]
plot(ydata,main="# Fatalities",xlab="Days",ylab="Fatalities",xlim=c(1,ncol(time_covid19US)))
text(length(ydata)-1,ydata[length(ydata)],names(totaldeaths[1]))

# Overlay the curves of the 30 locations with the most deaths
for (ctr in names(totaldeaths[1:30]))
{
  ydata <- as.numeric(time_covid19US[ctr,])
  ydata <- ydata[ydata > 1e-6]
  lines(ydata)
  text(length(ydata)-1,ydata[length(ydata)],ctr)
}


# Keep only the locations with more than 250 total deaths
totaldeaths  <- totaldeaths[!is.na(totaldeaths)]
totaldeaths  <- totaldeaths[totaldeaths > 250]

Estimating the peaks